Section: New Results

Analyzing and Reasoning on Heterogeneous Semantic Graphs

Distributed Artificial Intelligence for Revisable Linked Data Management

Participants : Ahmed El Amine Djebri, Andrea Tettamanzi, Fabien Gandon.

The aim of this PhD thesis is to study and propose original solutions for several key aspects: knowledge representation for uncertain, incomplete, and revisable data; uncertainty representation within a data source, with provenance; distributed knowledge revision and propagation; and reasoning over uncertain, incomplete, and distributed data sources. Starting from the open Web of Data, this work aims to give users more objective, exhaustive, and reliable views of and answers to their queries, based on distributed data sources with different levels of certainty and trustworthiness. We proposed a vocabulary to formalize the representation of uncertainty, and a framework to map uncertainty to sentences and contexts. This work was presented as a poster at ISWS [68].
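As a rough illustration of the general idea (attaching an uncertainty level and provenance to a context grouping a set of sentences), here is a minimal sketch using rdflib; the `unc:` namespace and its terms are hypothetical stand-ins, not the vocabulary actually proposed in [68].

```python
# Illustrative sketch only: the `unc:` terms below are hypothetical stand-ins
# for the uncertainty vocabulary proposed in the thesis.
from rdflib import Dataset, Literal, Namespace, URIRef
from rdflib.namespace import RDF, XSD

UNC = Namespace("http://example.org/uncertainty#")  # hypothetical vocabulary
EX = Namespace("http://example.org/data#")

ds = Dataset()

# A named graph acts as the "context" grouping a set of sentences (triples).
ctx = ds.graph(URIRef("http://example.org/context/source1"))
ctx.add((EX.Paris, EX.population, Literal(2148000)))

# Uncertainty and provenance metadata are attached to the context itself,
# in the default graph, rather than to each individual triple.
ds.add((ctx.identifier, RDF.type, UNC.UncertainContext))
ds.add((ctx.identifier, UNC.uncertaintyValue,
        Literal(0.8, datatype=XSD.decimal)))
ds.add((ctx.identifier, UNC.derivedFrom,
        URIRef("http://example.org/source/census2015")))

print(ds.serialize(format="trig"))
```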

Learning Class Disjointness Axioms using Grammatical Evolution

Participants : Thu Huong Nguyen, Andrea Tettamanzi.

The aim of this research is to automatically discover class disjointness axioms from RDF facts recorded on the Web of Data. This may be regarded as a case of inductive reasoning and ontology learning. The instances, represented by RDF triples, play the role of specific observations, from which axioms can be extracted by generalization. We proposed the use of Grammatical Evolution, a type of evolutionary algorithm, for mining OWL 2 disjointness axioms from an RDF data repository such as DBpedia. To evaluate candidate axioms against the DBpedia dataset, we adopt an approach based on possibility theory. A paper has been submitted to the EuroGP 2019 conference.
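As an illustration of the possibilistic evaluation, the sketch below computes simplified possibility and necessity degrees for a candidate DisjointClasses(C, D) axiom from instance counts; the formulas follow the general scheme of the team's earlier possibilistic axiom-scoring work in simplified form, and the counts (which in practice would come from SPARQL queries against DBpedia) are hypothetical.

```python
# Simplified sketch of the possibilistic scoring of a candidate axiom
# DisjointClasses(C, D). For such an axiom, a counterexample is an
# individual asserted to belong to both C and D; a confirmation is an
# individual belonging to exactly one of the two classes.
import math

def possibility(support: int, counterexamples: int) -> float:
    # Pi(phi): 1 when no counterexample is found, 0 when every
    # instance in the support contradicts the axiom.
    if support == 0:
        return 1.0
    return 1.0 - math.sqrt(1.0 - ((support - counterexamples) / support) ** 2)

def necessity(support: int, confirmations: int, counterexamples: int) -> float:
    # N(phi) is non-zero only if the axiom is fully possible
    # (no counterexamples at all).
    if support == 0 or counterexamples > 0:
        return 0.0
    return math.sqrt(1.0 - ((support - confirmations) / support) ** 2)

# Hypothetical counts for two candidate axioms.
print(possibility(1000, 10))    # a few counterexamples lower the possibility
print(necessity(1000, 990, 0))  # no counterexamples: necessity is high
```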

Semantic Data for Image Recognition

Participants : Anna Bobasheva, Fabien Gandon.

This work is done in the context of the MonaLIA project with the French Ministry of Culture, in collaboration with Frédéric Precioso (I3S, UCA). It consists of a preliminary study on image recognition over the Joconde database in connection with semantic data (JocondeLab).

The goal of this project is to exploit the cross-fertilization of recent advances in image recognition and in the semantic indexing of annotated image databases in order to improve the accuracy and the detail of the annotations. The idea is, at first, to assess the potential of machine learning (including deep learning) and of the semantic annotations on the Joconde database (350,000 illustrated artwork records from French museums). Joconde also contains metadata based on a thesaurus. In a previous project (JocondeLab), these metadata were formalized using Semantic Web formalisms, linking the iconographic Garnier thesaurus and DBpedia to the Joconde data.

We developed SPARQL queries on the Joconde database to extract subsets of images for training the deep learning classifier. We identified class subsets with enough labeled images for training, balanced the number of images per class, and excluded images belonging to intersecting classes.
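A sketch of this kind of extraction query is shown below, selecting only images annotated with exactly one target label to avoid intersecting classes; the endpoint URL and the `ex:` properties are hypothetical placeholders for the actual JocondeLab schema.

```python
# Illustrative sketch only: endpoint URL and property names (ex:imageURL,
# ex:depicts) are hypothetical; the actual JocondeLab schema differs.
from SPARQLWrapper import SPARQLWrapper, JSON

QUERY = """
PREFIX ex: <http://example.org/joconde#>
SELECT ?image ?label WHERE {
  ?record ex:imageURL ?image ;
          ex:depicts  ?label .
  # Keep only records annotated with exactly one of the target classes,
  # to avoid images whose classes intersect.
  FILTER NOT EXISTS {
    ?record ex:depicts ?other .
    FILTER (?other != ?label)
  }
}
LIMIT 1000
"""

sparql = SPARQLWrapper("http://example.org/jocondelab/sparql")  # hypothetical
sparql.setQuery(QUERY)
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
for row in results["results"]["bindings"]:
    print(row["image"]["value"], row["label"]["value"])
```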

We fine-tuned a pre-trained VGG16 CNN classifier with batch normalization [75] to classify the artwork images. We used transfer learning from the network's prior training on the ImageNet dataset to reduce training time, and we ran the classifier on several datasets and in different modes.
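The sketch below shows this kind of transfer-learning setup with torchvision's VGG16-with-batch-norm model; the number of classes, the dummy batch, and the hyperparameters are placeholders, not the project's actual configuration.

```python
# Minimal transfer-learning sketch with torchvision's VGG16 + batch norm.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 20  # hypothetical number of artwork classes

model = models.vgg16_bn(pretrained=True)  # weights learned on ImageNet

# Freeze the convolutional feature extractor: only the classifier head
# is trained, which greatly reduces training time.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final fully connected layer (1000 ImageNet classes)
# with one sized for the artwork classes.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, num_classes)

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch, for illustration.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```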

We developed another set of queries on the metadata and applied statistical methods to find dependencies between the classification outcome and the artwork properties. We identified the usable properties of the metadata (those sufficiently populated and with a reasonable number of categorical values). We used Recursive Feature Elimination (RFE) and decision trees to identify the most statistically significant dependent variables and the decision splitting values.
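The following sketch shows the kind of RFE-plus-decision-tree analysis described above, on synthetic stand-in data; the features and labels are invented for illustration.

```python
# Sketch of the feature-selection step: RFE over one-hot-encoded metadata
# properties with a decision tree, on synthetic stand-in data.
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 12))  # one-hot metadata features
y = X[:, 3] & X[:, 7]                   # outcome depends on features 3 and 7

# Keep the 5 features the tree finds most useful.
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
selector = RFE(tree, n_features_to_select=5).fit(X, y)
print("selected feature indices:", np.flatnonzero(selector.support_))

# Refit a shallow tree on the selected features to inspect splitting values.
tree.fit(X[:, selector.support_], y)
print(export_text(tree))
```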

The results were presented at a workshop of the Ministry of Culture and Inria, on November 22nd, at the Bibliothèque Nationale de France in Paris.

Hospitalization Prediction

Participants : Catherine Faron Zucker, Fabien Gandon, Raphaël Gazzotti.

HealthPredict is a project conducted in collaboration with the Département d'Enseignement et de Recherche en Médecine Générale (DERMG) at Université Côte d'Azur and the SynchroNext company. It aims at providing a digital health solution for the early management of patients through consultations with their general practitioner and the health care circuit. Concretely, it is a predictive artificial intelligence interface that cross-references, in real time, data on the symptoms, diagnoses, and medical treatments of the population to predict the hospitalization of a patient. The first results of this project will be presented at the French EGC 2019 conference [54]. In this paper, we report and discuss the results of our first experiments on the PRIMEGE PACA database, which contains more than 350,000 consultations carried out by 16 general practitioners. We propose and evaluate different ways of enriching the features extracted from electronic medical records with ontological resources before turning them into vectors used by machine learning algorithms to predict hospitalization.
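As a much-simplified sketch of the feature-enrichment idea (not the actual HealthPredict pipeline), the example below appends hypothetical ontology concepts to a record's text before vectorization, so that records sharing a broader concept also share a feature; the concept dictionary, records, and labels are all invented.

```python
# Simplified sketch: enrich text features extracted from a medical record
# with ontology concepts before vectorization. The concept dictionary is a
# hypothetical stand-in for real ontological resources.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical mapping from terms to broader ontology concepts.
CONCEPTS = {
    "insulin": "CONCEPT_diabetes_treatment",
    "metformin": "CONCEPT_diabetes_treatment",
    "dyspnea": "CONCEPT_respiratory_symptom",
}

def enrich(text: str) -> str:
    # Append the concept label of every recognized term, so the vectorizer
    # produces an extra feature shared by all terms under that concept.
    extra = [CONCEPTS[w] for w in text.lower().split() if w in CONCEPTS]
    return text + " " + " ".join(extra)

records = ["patient on insulin reports dyspnea", "routine visit no treatment"]
hospitalized = [1, 0]  # toy labels

model = make_pipeline(CountVectorizer(preprocessor=enrich),
                      LogisticRegression())
model.fit(records, hospitalized)
print(model.predict(["metformin prescribed dyspnea worsening"]))
```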

Fake News Detection

Participants : Jérôme Delobelle, Elena Cabrio, Serena Villata.

This work is part of the RAPID CONFIRMA (COntre argumentation contre les Fausses InfoRMAtion) DGA project, which aims to automatically detect fake news and limit its diffusion. For this purpose, a framework will be developed to detect fake news, to reduce its propagation, and to propose the best response strategies.

Thus, in addition to identifying the communities propagating fake news, we will use methods from natural language processing and argumentation theory to automatically extract counter-arguments (adapted to the target audience) from existing reference press articles. These arguments can be used to attack the false information detected in fake news. Argument mining techniques will make it possible to (1) analyze argumentation in natural language, for example by looking for argumentative structures and identifying the support or attack relations between arguments, and (2) locate the data related to a specific piece of information (related to a fake news item) on the Web.
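Purely as an illustration of how such a relation-prediction step can be cast as supervised text classification (and not as a description of the project's actual system), here is a toy sketch over argument pairs with invented training data:

```python
# Illustrative sketch only: support/attack relation prediction between
# argument pairs cast as text classification, on toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each example joins the two arguments of a pair with a separator token.
pairs = [
    "vaccines cause autism [SEP] large cohort studies found no such link",
    "masks reduce transmission [SEP] filtration studies confirm droplet capture",
]
labels = ["attack", "support"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(pairs, labels)
print(clf.predict(["the earth is flat [SEP] satellite imagery shows curvature"]))
```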

Mining and Reasoning on Legal Documents

Participants : Cristian Cardellino, Milagro Teruel, Serena Villata.

Together with Cristian Cardellino, Fernando Cardellino, Milagro Teruel, and Laura Alonso Alemany from the Univ. of Cordoba, we proposed a methodology to improve argument annotation guidelines by exploiting inter-annotator agreement measures. After a first stage of the annotation effort, we detected problematic issues through an analysis of inter-annotator agreement. We detected ill-defined concepts, which we addressed by redefining high-level annotation goals. For other concepts, which are well-delimited but complex, the annotation protocol was extended and detailed. Moreover, as could be expected, we showed that the distinctions on which human annotators agree less are also those where automatic analyzers perform worse. Thus, the reproducibility of the results of argument mining systems can be addressed by improving inter-annotator agreement in the training material. Following this methodology, we are enhancing a corpus annotated with argumentation, available online (https://github.com/PLN-FaMAF/ArgumentMiningECHR) together with the guidelines and agreement analyses. These analyses can be used to filter the performance figures of automated systems, with lower penalties for the cases on which human annotators agree less. This research is carried out in the context of the EU H2020 MIREL project. The results have been published at LREC [59].
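As a minimal illustration of the kind of agreement measure involved, the snippet below computes Cohen's kappa over toy labels from two annotators; the labels and categories are invented for the example.

```python
# Chance-corrected agreement between two annotators over the same spans.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["premise", "claim", "premise", "none", "claim"]
annotator_b = ["premise", "claim", "none",    "none", "claim"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"Cohen's kappa: {kappa:.2f}")
```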

Together with colleagues from Data61 Queensland (Australia) and Antonino Rotolo (University of Bologna), we proposed a formal framework that can instantiate, in agents' dialogues, moral/rational criteria such as the maximin principle, Pareto efficiency, and impartiality, which are used, e.g., in John Rawls' theory and in rule utilitarianism. Most ethical systems define how individuals, as members of a society, ought morally to act. Eliciting a moral theory governing the agents in a society requires them to express their own norms, with the aim of finding a moral theory on which all may agree. This research is carried out in the context of the EU H2020 MIREL project. The results have been published at DEON [57].
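As a toy illustration of two of the criteria named above (outside any dialogue setting, and not the paper's formal framework), the sketch below selects a norm by maximin and tests Pareto dominance over hypothetical agent utilities:

```python
# utilities[i] gives, for alternative i, the utility of each agent.
utilities = {
    "norm_A": (3, 3, 3),
    "norm_B": (9, 1, 2),
    "norm_C": (4, 4, 2),
}

def maximin(options):
    # Rawlsian maximin: prefer the option whose worst-off agent fares best.
    return max(options, key=lambda o: min(options[o]))

def pareto_dominates(u, v):
    # u dominates v if no agent is worse off and at least one is better off.
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

print(maximin(utilities))  # norm_A: its minimum (3) is the highest minimum
print(pareto_dominates(utilities["norm_C"], utilities["norm_B"]))  # False
```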

Argumentation

Participants : Serena Villata, Andrea Tettamanzi.

In collaboration with Mauro Dragoni of FBK and Célia da Costa Pereira of I3S, we proposed the SMACk system, which combines argumentation and aspect-based opinion mining [14].

Agent-Based Recommender Systems

Participants : Amel Ben Othmane, Nhan Le Thanh, Andrea Tettamanzi, Serena Villata.

We proposed a spatio-temporal extension of our multi-context framework for agent-based recommender systems (CARS), to which we added representations and algorithms to manage uncertainty, imprecision, and approximate reasoning in time and space [47].

RDF Mining

Participants : Duc Minh Tran, Andrea Tettamanzi.

In collaboration with Dario Malchiodi of the University of Milan and Célia da Costa Pereira of I3S, we studied the use of a prediction model as a surrogate for a possibilistic score of OWL axioms [38], [37].

In collaboration with Claudia d'Amato of the University of Bari, we compared rule evaluation metrics for EDMAR, our evolutionary approach to discovering multi-relational rules from ontological knowledge bases by exploiting the services of an OWL reasoner [52].